Covariate-adaptive randomization inference in matched designs
It is common to conduct causal inference in matched observational studies by
proceeding as though treatment assignments within matched sets are made
uniformly at random and using this distribution as the basis for inference.
This approach ignores observed discrepancies in matched sets that may be
consequential for the distribution of treatment, which are succinctly captured
by within-set differences in the propensity score. We address this problem via
covariate-adaptive randomization inference, which modifies the permutation
probabilities to vary with estimated propensity score discrepancies and avoids
requirements to exclude matched pairs or model an outcome variable. We show
that the test achieves type I error control arbitrarily close to the nominal
level when large samples are available for propensity score estimation. We
characterize the large-sample behavior of the new randomization test for a
difference-in-means estimator of a constant additive effect. We also show that
existing methods of sensitivity analysis generalize effectively to
covariate-adaptive randomization inference. Finally, we evaluate the empirical
value of covariate-adaptive randomization procedures via comparisons to
traditional uniform inference in matched designs with and without propensity
score calipers and regression adjustment using simulations and analyses of
genetic damage among welders and right-heart catheterization in surgical
patients. Comment: 41 pages, 8 figures
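The core mechanic described above can be sketched briefly. This is a hypothetical illustration, not the authors' implementation: within-pair assignment probabilities are derived from estimated propensity scores, and a Monte Carlo randomization test flips the sign of each pair's treated-minus-control difference with the corresponding probability instead of the uniform probability 1/2. All function and argument names here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def pair_assignment_prob(e1, e2):
    """P(unit 1 of a pair is the treated one | exactly one treated),
    derived from estimated propensity scores e1, e2.
    Reduces to the uniform 1/2 when e1 == e2."""
    odds = (e1 * (1 - e2)) / (e2 * (1 - e1))
    return odds / (1.0 + odds)

def ca_randomization_test(diffs, p_obs, n_draws=5000):
    """Monte Carlo covariate-adaptive randomization test of no effect.
    diffs[i] is the observed treated-minus-control outcome difference in
    pair i; p_obs[i] is the probability that the observed assignment in
    pair i recurs, so a resampled assignment flips the sign of diffs[i]
    with probability 1 - p_obs[i].  Sketch only, not the paper's code."""
    diffs = np.asarray(diffs, dtype=float)
    obs = diffs.mean()
    flips = rng.random((n_draws, diffs.size)) < (1.0 - np.asarray(p_obs))
    sims = np.where(flips, -diffs, diffs).mean(axis=1)
    return np.mean(np.abs(sims) >= abs(obs))
```

With equal within-pair propensity scores the test reduces to the traditional uniform randomization test.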
Variance-based sensitivity analysis for weighting estimators results in more informative bounds
Weighting methods are popular tools for estimating causal effects; assessing
their robustness under unobserved confounding is important in practice. In
this paper, we introduce a new set of sensitivity models called
"variance-based sensitivity models", which characterize the bias from an
omitted confounder by bounding the distributional differences it induces in
the weights. They offer several notable innovations over existing approaches.
First, variance-based sensitivity models can be parameterized with respect to
a single parameter that is both standardized and bounded. We introduce a formal
benchmarking procedure that allows researchers to use observed covariates to
reason about plausible parameter values in an interpretable and transparent
way. Second, we show that researchers can estimate valid confidence intervals
under a set of variance-based sensitivity models, and provide extensions for
researchers to incorporate their substantive knowledge about the confounder to
help tighten the intervals. Last, we highlight the connection between our
proposed approach and existing sensitivity analyses, and demonstrate, both
empirically and theoretically, that variance-based sensitivity models can
improve both the stability and tightness of the estimated confidence intervals
relative to existing methods. We illustrate our proposed approach on a study
examining blood mercury levels using the National Health and Nutrition
Examination Survey (NHANES).
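The benchmarking idea can be illustrated with a small sketch. The function name and exact summary below are hypothetical, and the paper's standardized parameter is defined differently; the sketch simply shows the flavor of the approach: refit the weights with one observed covariate deliberately omitted and summarize the multiplicative discrepancy this induces in the weights, which gives a reference point for plausible values of the sensitivity parameter.

```python
import numpy as np

def benchmark_weight_discrepancy(w_full, w_dropped):
    """Benchmarking-style summary for a variance-based sensitivity
    analysis: compare weights estimated with all observed covariates
    (w_full) to weights re-estimated with one covariate omitted
    (w_dropped), and report the variance of the multiplicative error
    between the mean-one normalized weights.  Hypothetical sketch."""
    w_full = np.asarray(w_full, dtype=float)
    w_dropped = np.asarray(w_dropped, dtype=float)
    w_full = w_full / w_full.mean()
    w_dropped = w_dropped / w_dropped.mean()
    return float(np.var(w_full / w_dropped))
```

Dropping a covariate that barely moves the weights yields a value near zero; a strong observed confounder yields a large value, anchoring what "a confounder as strong as X" means on this scale.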
An Exact Test of Fit for the Gaussian Linear Model using Optimal Nonbipartite Matching
Fisher tested the fit of Gaussian linear models using replicated observations. We refine this method by (1) constructing near-replicates using an optimal nonbipartite matching and (2) defining a distance that focuses on predictors important to the model’s predictions. Near-replicates may not exist unless the predictor set is low-dimensional; the test addresses dimensionality by betting that model failures involve a subset of predictors important in the old fit. Despite using the old fit to pair observations, the test has exactly its stated level under the null hypothesis. Simulations show the test has reasonable power even when many spurious predictors are present.
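The near-replicate construction can be sketched as follows. The abstract's method solves an optimal nonbipartite matching; the code below substitutes a simple greedy pairing as a stand-in (a real implementation would solve the matching exactly), and the `importance` vector, e.g. |coefficient| times predictor standard deviation, is a hypothetical choice of how to focus the distance on predictors that matter to the old fit.

```python
import numpy as np
from itertools import combinations

def greedy_near_replicates(X, importance):
    """Pair rows of X into near-replicates using a weighted L1 distance
    that up-weights predictors important to the fitted model's
    predictions.  Greedy stand-in for an optimal nonbipartite matching;
    assumes an even number of rows (one row is left unpaired otherwise)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    dist = (np.abs(X[:, None, :] - X[None, :, :]) * importance).sum(axis=2)
    edges = sorted((dist[i, j], i, j) for i, j in combinations(range(n), 2))
    pairs, used = [], set()
    for _, i, j in edges:
        if i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    return pairs
```

Within-pair outcome differences from such pairs then furnish a pure-error estimate against which lack of fit can be compared, in the spirit of Fisher's replicated-observations test.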
Constructed Second Control Groups and Attenuation of Unmeasured Biases
The informal folklore of observational studies claims that if an irrelevant observed covariate is left uncontrolled, say unmatched, then it will influence treatment assignment in haphazard ways, thereby diminishing the biases from unmeasured covariates. We prove a result along these lines: it is true, in a certain sense, to a limited degree, under certain conditions. Alas, the conditions are neither inconsequential nor easy to check in empirical work; indeed, they are often dubious, more often implausible. We suggest the result is most useful in the computerized construction of a second control group, where the investigator can see more in available data without necessarily believing the required conditions. One of the two control groups controls for the possibly irrelevant observed covariate, the other control group either leaves it uncontrolled or forces separation; therefore, the investigator views one situation from two angles under different assumptions. A pair of sensitivity analyses for the two control groups is coordinated by a weighted Holm or recycling procedure built around the possibility of slight attenuation of bias in one control group. Issues are illustrated using an observational study of the possible effects of cigarette smoking as a cause of increased homocysteine levels, a risk factor for cardiovascular disease. Supplementary materials for this article are available online.
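The weighted Holm procedure mentioned above is standard and easy to state concretely. The sketch below is a generic weighted Holm step-down, not the paper's specific coordination scheme; the idea is that unequal weights let the analyst favor the comparison in the control group where bias is expected to be slightly attenuated.

```python
def weighted_holm(pvals, weights, alpha=0.05):
    """Weighted Holm step-down procedure: order hypotheses by p_i / w_i
    and, at each step, reject if p_i <= w_i * alpha / (total remaining
    weight), stopping at the first failure.  With equal weights this
    reduces to the ordinary Holm procedure."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i] / weights[i])
    remaining = float(sum(weights))
    reject = [False] * len(pvals)
    for i in order:
        if pvals[i] <= weights[i] * alpha / remaining:
            reject[i] = True
            remaining -= weights[i]
        else:
            break
    return reject
```

For example, up-weighting one of three hypotheses can rescue a rejection that ordinary Holm would miss, at the price of a stricter threshold elsewhere.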
Large, Sparse Optimal Matching with Refined Covariate Balance in an Observational Study of the Health Outcomes Produced by New Surgeons
Every newly trained surgeon performs her first unsupervised operation. How do the health outcomes of her patients compare with the patients of experienced surgeons? Using data from 498 hospitals, we compare 1252 pairs comprising a new surgeon and an experienced surgeon working at the same hospital. We introduce a new form of matching that matches patients of each new surgeon to patients of an otherwise similar experienced surgeon at the same hospital, perfectly balancing 176 surgical procedures and closely balancing a total of 2.9 million categories of patients; additionally, the individual patient pairs are as close as possible. A new goal for matching is introduced, called refined covariate balance, in which a sequence of nested, ever more refined, nominal covariates is balanced as closely as possible, emphasizing the first or coarsest covariate in that sequence. A new algorithm for matching is proposed and the main new results prove that the algorithm finds the closest match in terms of the total within-pair covariate distances among all matches that achieve refined covariate balance. Unlike previous approaches to forcing balance on covariates, the new algorithm creates multiple paths to a match in a network, where paths that introduce imbalances are penalized and hence avoided to the extent possible. The algorithm exploits a sparse network to quickly optimize a match that is about two orders of magnitude larger than is typical in statistical matching problems, thereby permitting much more extensive use of fine and near-fine balance constraints. The match was constructed in a few minutes using a network optimization algorithm implemented in R. An R package called rcbalance implementing the method is available from CRAN.
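The notion of refined covariate balance is concrete enough to check on any completed match. The sketch below (function and argument names hypothetical; this checks balance, it does not construct the match, which requires the network algorithm and the rcbalance package described above) reports, for a nested sequence of nominal covariates ordered coarsest first, the number of mismatched category counts at each refinement level; the match minimizes these counts lexicographically, coarsest level first.

```python
from collections import Counter

def refined_balance_report(treated_cats, control_cats):
    """For each level of a nested sequence of nominal covariates
    (each row is a tuple of categories, coarsest first), report the
    number of treated units whose refined category cannot be matched
    by a control unit's count -- the quantity refined covariate
    balance minimizes lexicographically."""
    report = []
    for level in range(len(treated_cats[0])):
        t = Counter(row[:level + 1] for row in treated_cats)
        c = Counter(row[:level + 1] for row in control_cats)
        imbalance = sum(abs(t[k] - c[k]) for k in set(t) | set(c)) // 2
        report.append(imbalance)
    return report
```

Note the counts are nondecreasing in refinement: splitting a balanced coarse category can only reveal finer imbalances, never remove coarse ones.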
Interpretable Sensitivity Analysis for Balancing Weights
Assessing sensitivity to unmeasured confounding is an important step in
observational studies, which typically estimate effects under the assumption
that all confounders are measured. In this paper, we develop a sensitivity
analysis framework for balancing weights estimators, an increasingly popular
approach that solves an optimization problem to obtain weights that directly
minimize covariate imbalance. In particular, we adapt a sensitivity analysis
framework using the percentile bootstrap for a broad class of balancing weights
estimators. We prove that the percentile bootstrap procedure can, with only
minor modifications, yield valid confidence intervals for causal effects under
restrictions on the level of unmeasured confounding. We also propose an
amplification to allow for interpretable sensitivity parameters in the
balancing weights framework. We illustrate our method through extensive real
data examples.
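The percentile bootstrap backbone of such a procedure can be sketched in a few lines. This is a generic sketch of a percentile bootstrap interval for a weighted difference in means, with hypothetical names; the paper's procedure additionally varies the weights over a sensitivity model before taking percentiles, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(1)

def percentile_bootstrap_ci(y, t, w, n_boot=2000, level=0.95):
    """Percentile bootstrap confidence interval for a weighted
    difference in means between treated (t == 1) and control (t == 0)
    units: resample units with replacement, recompute the weighted
    estimate, and take the central quantiles of the resampled estimates."""
    n = len(y)
    ests = []
    for _ in range(n_boot):
        i = rng.integers(0, n, n)
        yb, tb, wb = y[i], t[i], w[i]
        est = (np.sum(wb * tb * yb) / np.sum(wb * tb)
               - np.sum(wb * (1 - tb) * yb) / np.sum(wb * (1 - tb)))
        ests.append(est)
    lo, hi = np.quantile(ests, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi
```

Under a sensitivity model, the inner estimate would be replaced by the extreme (minimal or maximal) estimate over the allowed set of weight perturbations, yielding the valid intervals under bounded unmeasured confounding that the abstract describes.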